Survival benefits of perioperative chemoradiotherapy versus chemotherapy for advanced stage gastric cancer based on directed acyclic graphs

The overall survival benefits of perioperative chemotherapy (PCT) and perioperative chemoradiotherapy (PCRT) for patients with locally advanced gastric cancer (GC) have not been fully explored. The aim of this study was to compare the benefits of PCT and PCRT in GC patients and determine the factors affecting survival rate using directed acyclic graphs (DAGs). The data of 1,442 patients with stage II-IV GC who received PCT or PCRT from 2000 to 2018 were retrieved from the Surveillance, Epidemiology, and End Results (SEER) database. First, the least absolute shrinkage and selection operator (LASSO) was used to identify possible influencing factors for overall survival. Second, the variables selected by LASSO were used in univariate and multivariate Cox regression analyses. Third, confounding factors for corrective analyses were selected based on DAGs, which show the possible associations between patient characteristics and outcomes in advanced GC and help to evaluate the prognosis. Patients who received PCRT had longer overall survival than those who received PCT (P = 0.015). The median overall survival of the PCRT group was 36.5 (15.0–53.0) months, longer than that of the PCT group (34.6 (16.0–48.0) months). PCRT is more likely to benefit patients who are aged ≤ 65, male, white, and have regional tumors (P<0.05). The multivariate Cox regression model showed that male sex, widowed status, signet ring cell carcinoma, and lung metastases were independent risk factors for a poor prognosis. According to the DAG, age, race, and Lauren type may be confounding factors that affect the prognosis of advanced GC. Compared to PCT, PCRT has more survival benefits for patients with locally advanced GC, and ongoing investigations are needed to better determine the optimal treatment. Furthermore, DAGs are a useful tool for contending with confounding and selection biases to ensure the proper implementation of high-quality research.


Introduction
Gastric cancer (GC) remains a prevalent human malignant disease and the fourth leading cause of cancer-related death globally [1]. In the United States, approximately 11,140 patients died from this disease in 2019 [2]. In Western countries, more than half of patients have a locally advanced stage at initial diagnosis [3]. Because advanced-stage GC has largely heterogeneous characteristics, an uncertain mechanism, limited therapy strategies, and a less than 12-month median survival time [3][4][5], it is essential to develop optimal individualized management strategies for such patients.
Gastrectomy remains the primary treatment for locally advanced GC [6]. However, satisfactory outcomes cannot be achieved by surgical resection alone, and the five-year survival rate is only 20%-50%, which has led to efforts to improve the survival of these patients with neoadjuvant or adjuvant therapies [7]. The MAGIC trial and the FLOT4 trial were milestone studies that showed superior survival in patients who received perioperative chemotherapy (PCT) compared to surgery alone for GC [8][9][10]. In America, the use of PCT in patients with T2+ gastric adenocarcinoma has increased from 34% in 2006 to 65% at present, making it a standard treatment for GC [11]. Taking into account the high local recurrence and metastasis rates in patients with GC, chemotherapy (CT) combined with radiotherapy (RT) has been proposed and compared with CT in several clinical trials. The INT0116 trial revealed significant survival benefits of adjuvant chemoradiotherapy (CRT) for GC patients after surgery [12]. Additionally, the results of the POET trial demonstrated that the inclusion of RT in preoperative treatment led to survival advantages [13]. On the other hand, the CRITICS trial showed that in patients who underwent preoperative CT, postoperative CRT did not improve survival when compared to postoperative CT [14]. The ARTIST trial also showed that adjuvant RT combined with CT did not have a positive influence on patient survival [15]. A growing body of evidence suggests that careful assessment of the survival benefits of adjuvant and neoadjuvant CRT is necessary for treatment strategy selection. Until now, evidence comparing PCT and PCRT has been limited to several small randomized trials conducted in East Asian countries rather than North American countries [16][17][18]. In Europe and the United States, it remains unclear whether PCT should be used in combination with radiotherapy and whether chemoradiotherapy is efficacious.
Therefore, screening patients with advanced GC to determine whether they can receive PCRT is critical to improving survival rates.
Determining causal relations and eliminating bias are the primary aims of research conducted by the scientific community. To date, observational research has primarily focused on relationships between covariates and outcomes, with authors only rarely asserting a direct causal relationship. Because randomization has been regarded as the most reliable strategy for eliminating bias among treatment groups, statements of direct causal relationships have only been made in the context of randomized trials. Although randomized clinical trials are useful for addressing bias, they are expensive, time-consuming, frequently unfeasible, and can be unrepresentative of the target population when blinding methods are used [19,20]. Given that observational research has enormous unrealized potential to guide clinical decision-making, structured approaches to examining causal connections in specified datasets are receiving increasing attention [21]. Directed acyclic graphs (DAGs) are visual representations of causality in scientific discussions and are being increasingly used in modern epidemiology. Pearl and Spirtes popularized techniques of causal inference using DAGs in 1995 [22], but their use in dental research was first advocated only in 2002, and several subsequent studies have been directed toward theoretical exploration rather than clinical practice [23,24]. This visual aid helps to explicitly describe the underlying relationships defined in the scientific discussion. Therefore, the purpose of the current study was to compare the predicted length of survival of advanced GC patients who received PCT with that of patients who received PCRT. Furthermore, we screened features related to PCT and PCRT in patients with advanced GC and drew a DAG to identify and control potential confounders so that clinicians can select a more appropriate treatment strategy for these patients.

Population
The available data from a retrospective cohort study were extracted using SEER*stat (8.3.6) software. In our study, we used the SEER database's tumor nomenclature and coding manual [25] as well as the International Classification of Diseases tumor morphology code ICD-O-3 to extract data from GC patients treated between 2004 and 2018 [26]. All the data used in this study were openly accessible and retrieved from the SEER database. TNM staging was used to classify all cancer samples according to the American Joint Committee on Cancer (AJCC). Inclusion criteria: (1) patients who were diagnosed from 2004 to 2018 with stage II − IV; (2) patients in whom the primary site of GC was the stomach; (3) patients for whom the exact treatment strategy was PCT or PCRT; and (4) patients with a pathologically confirmed diagnosis of GC. Ineligible cases with unknown or missing characteristic data were excluded. S1

Study variables
The following variables were extracted from the SEER cohort: sex, age, race, marital status, primary site, histologic type, T stage, N stage, M stage, tumor size, differentiation, summary stage, Lauren type, bone metastases, brain metastases, liver metastases, lung metastases, and comprehensive treatment. Information on metastatic sites in the bone, brain, liver, and lung (SEER Combined Mets at DX-bone, brain, liver, and lung) and on comprehensive treatment (CT, RT, systemic therapy) has been collected since 2010; thus, metastatic GC and systemically treated patients diagnosed with stage II–IV disease from 2010 to 2015 were included. Continuous variables were transformed into categorical variables, namely age (≤ 65 and > 65 years old) and tumor size (≤ 5 cm and > 5 cm). The summary stage was categorized into regional, distant, and localized. Primary sites were divided into five subsites as follows: cardia and fundus, body, antrum and pylorus, lesser and greater curvature, and others. The tumors were pathologically categorized into poorly differentiated, moderately differentiated, well differentiated, and undifferentiated. The histological types were categorized into intestinal types, diffuse types, mixed types, and other types. The pathology types were divided into adenocarcinoma and signet ring cell carcinoma. The primary outcome of the study was overall survival (OS), which was calculated from the date of diagnosis until the date of death from any cause or the end of follow-up.
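The dichotomization of continuous variables described above can be sketched as follows. This is only an illustrative sketch, assuming the cut points are "65 years or younger versus older than 65" and "5 cm or smaller versus larger than 5 cm"; the function names, labels, and patient records are hypothetical and are not taken from the SEER extract.

```python
def categorize_age(age_years: float) -> str:
    """Dichotomize age at the assumed 65-year cut point."""
    return "<=65" if age_years <= 65 else ">65"


def categorize_tumor_size(size_cm: float) -> str:
    """Dichotomize tumor size at the assumed 5 cm cut point."""
    return "<=5cm" if size_cm <= 5 else ">5cm"


# Hypothetical records, invented purely for illustration.
patients = [
    {"age": 58, "tumor_size_cm": 3.2},
    {"age": 65, "tumor_size_cm": 5.0},
    {"age": 72, "tumor_size_cm": 7.5},
]

for patient in patients:
    patient["age_group"] = categorize_age(patient["age"])
    patient["size_group"] = categorize_tumor_size(patient["tumor_size_cm"])
    print(patient)
```

Values exactly at a cut point (65 years, 5 cm) fall into the lower category under this assumption.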

Defining covariates for a directed acyclic graph
A DAG comprises a series of nodes representing variables and arrows representing causal relationships between different variables. The nodes are selected based on the prognostic factors included in the Cox proportional hazards model for functional outcomes in advanced GC patients. The direction of each arrow is based on recent literature or a priori knowledge. To orient the arrows in a way that facilitates causal interpretation, we applied the following prior knowledge-based constraints: adjuvant concurrent CT or CRT is the standard care for patients with resected advanced GC [14]. Therefore, the primary survival outcome was specified as a sink with only inward-pointing arrows, and treatment (PCT or PCRT), the secondary outcome measure, was specified as a source with only outward-pointing arrows. Moreover, many nonmodifiable variables, such as age, sex, race, marital status, histologic type, Lauren type, and lung metastases, are also related to advanced GC. One study pointed out a direct association between age and adenocarcinoma and intestinal-type metastasis. The clinicopathological characteristics showed that metastatic disease was more common in young patients and the intestinal type was more common in old patients, and the majority of them were male [27,28]. Another study reported that age and sex may be modifiers of the effects of adjuvant CRT [29]. Furthermore, GC is a phenotypically highly heterogeneous disease that may exhibit a variety of biological behaviors, as patients with intestinal-type GC had better overall survival than those with diffuse-type and mixed-type GC [5,28]. Patients exhibit different sensitivities to CT or CRT according to Lauren's classification [11,30,31]. Thus, it is difficult to select the optimum treatment. Marital status may also affect the prognosis in GC patients; in particular, marriage plays a positive role [32].
Studies also indicated that unmarried or widowed patients with various tumor types were at a high risk of metastatic presentation and had shorter survival [32][33][34]. The DAG model was generated by applying the Tetrad software to the data under these constraints.
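The structural constraints described above (no cycles, survival as a sink, treatment as a source) can be checked mechanically. The following is a minimal sketch using only the Python standard library; the edge list is hypothetical and serves only as an illustration, not as the graph fitted in this study, and the helper names are our own.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical edges (cause, effect), for illustration only.
edges = [
    ("treatment", "survival"),
    ("sex", "survival"),
    ("marital_status", "survival"),
    ("histologic_type", "survival"),
    ("lung_metastases", "survival"),
]


def build_parents(edge_list):
    """Map each node to the set of its direct causes (parents)."""
    parents = {}
    for cause, effect in edge_list:
        parents.setdefault(cause, set())
        parents.setdefault(effect, set()).add(cause)
    return parents


def is_acyclic(parents):
    """A graph is a DAG if a topological order exists."""
    try:
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


def is_sink(node, edge_list):
    """A sink has only inward-pointing arrows."""
    return all(cause != node for cause, _ in edge_list)


def is_source(node, edge_list):
    """A source has only outward-pointing arrows."""
    return all(effect != node for _, effect in edge_list)


print("acyclic:", is_acyclic(build_parents(edges)))
print("survival is a sink:", is_sink("survival", edges))
print("treatment is a source:", is_source("treatment", edges))
```

Adding a reversed arrow such as ("survival", "treatment") would violate all three constraints, which is the kind of error such a check is meant to catch before any estimation is attempted.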

Statistical analyses
First, the LASSO was used to identify possible influencing factors for overall survival. The "glmnet" package was used to perform the LASSO regression analysis. The baseline characteristics of the training set and the validation set, into which patients were randomly divided for the LASSO regression, were compared using the χ2 test or Fisher's exact test. Second, the statistically significant variables selected by the LASSO from the training set were used in the univariate analysis. Survival curves were estimated using the Kaplan-Meier method, and the log-rank test was used to determine survival differences between the groups. Third, independent risk factors that affected OS in advanced GC patients were determined in a Cox regression model. Relative risks were estimated by calculating the hazard ratio (HR) and 95% confidence intervals (CIs). Variables identified by multivariate analysis were incorporated into nomograms that were constructed as visual graphics. Finally, confounding factors for corrective analyses were selected based on DAGs that showed the possible associations among advanced GC patients, primary tumors, comprehensive treatment, and outcomes. The p values were derived from two-tailed tests, and p values < 0.05 were considered statistically significant. Statistical analysis was performed with SPSS software version 23.0 (SPSS, Inc., Chicago, IL, USA) and RStudio software (version 3.6.1; https://www.rstudio.com). The DAG was drawn using the Tetrad 6.9.0 web-based software (Tools, Center for Causal Discovery, pitt.edu).
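To make the Kaplan-Meier step concrete, the product-limit estimator and the median survival time can be sketched in a few lines. This is an illustrative sketch only, not the SPSS or R routines used in the study; the follow-up times and event indicators below are invented, and the function names are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up in months
    events : 1 if death was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # each patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def median_survival(curve):
    """First time at which S(t) is 0.5 or lower; None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Invented toy data: six patients, two of them censored.
toy_times = [5, 8, 12, 12, 20, 30]
toy_events = [1, 0, 1, 1, 0, 1]

toy_curve = kaplan_meier(toy_times, toy_events)
print(toy_curve)
print("median survival (months):", median_survival(toy_curve))
```

Censored patients contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimator can use incomplete follow-up without discarding it.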

Patient's baseline characteristics (overall sample)
A total of 1442 advanced GC patients, including 289 females and 1153 males, were included in the current study; 410 patients received PCT and 1032 patients received PCRT. After random sampling at a ratio of 2:1, 1022 and 420 patients were included in the training set and validation set, respectively. The mean age was 61±11 years. In most patients (1174 patients, 81.4%), the tumor was located at the cardia or fundus, and regional tumors were reported in 1106 patients (76.7%). There were 822 patients with poor differentiation, accounting for 57.0% of all patients, and 1233 patients with an advanced T stage (T3-T4), accounting for 85.5% of all patients. The characteristics of all the GC patients who met the inclusion and exclusion criteria are shown in S1 Fig, and the data of the patients in the two sets are presented in S1 Table.

Feature selection
Of the 18 candidate characteristic variables, 16 potential predictors from the cohort data (S2 Fig) were retained with nonzero regression coefficients in the LASSO algorithm. Ten-fold cross-validation, with centralization and normalization of the included factors, was performed, and the best lambda value was then chosen. The best tuning parameter lambda for the LASSO regression was 0.0018, at which the partial −2 log-likelihood (binomial deviance) reached its minimum value. The area under the receiver operating characteristic (ROC) curve was used to assess how well the LASSO regression model discriminated true positives from false positives. All the selected variables showed significant differences and were then used to develop the nomogram models. The nomogram in this study only presents the independent risk factors in the multivariate analysis.
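The way the LASSO penalty drives some coefficients exactly to zero while retaining others can be illustrated with a small coordinate-descent sketch. Note the assumptions: this is an ordinary least-squares LASSO on invented, centered toy data, not the penalized survival model fitted with "glmnet" in this study, and cross-validation for lambda is omitted; all names are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n) * ||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumes centered columns and outcome (no intercept) and no all-zero column.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                fit_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n)
    return beta


# Invented toy data: the outcome depends on the first feature only.
X_toy = [[-2.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [2.0, 1.0]]
y_toy = [-6.0, -3.0, 3.0, 6.0]

print("lambda = 0.5:", lasso_coordinate_descent(X_toy, y_toy, lam=0.5))
print("lambda = 8.0:", lasso_coordinate_descent(X_toy, y_toy, lam=8.0))
```

With a small penalty the informative feature keeps a shrunken, nonzero coefficient and the uninformative one is set to zero; with a large penalty both are removed. Cross-validation is the usual way to choose a lambda between these extremes.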

Survival by treatment groups
To identify the prognostic factors related to overall survival, subgroup analyses stratified by comprehensive treatment were performed. The results demonstrated that the survival of the PCRT group was significantly better than that of the PCT group in the training set (Table 1). Patients who received PCRT had a longer OS than those who received PCT (hazard ratio = 0.846, 95% CI = 0.738–0.970, P = 0.015). The median overall survival of the PCRT group was 36.5 (15.0–53.0) months, longer than that of the PCT group (34.6 (16.0–48.0) months). In comparison to PCT, PCRT benefits patients who are aged ≤ 65, male, white, and have regional tumors (P<0.05). These results indicated that PCRT benefitted patients with advanced GC in terms of survival.

Univariate and multivariate analyses
In the univariate analysis, the factors significantly associated with survival in advanced GC were marital status, primary site, histology type, TNM stage, differentiation, summary stage, lung metastases, liver metastases, and comprehensive treatment (Fig 1). In the multivariate Cox regression model, male sex, widowed status, signet ring cell carcinoma, and lung metastases were considered to be independent risk factors for a poor prognosis (Fig 2). PCRT was still significantly associated with better survival in patients with advanced GC (HR = 0.862, 95% CI = 0.744–0.996, P = 0.044).
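The relationship between a reported hazard ratio, its 95% confidence interval, and the corresponding p value can be sketched as follows. This is a rough consistency check under the assumption of a Wald test on the log hazard ratio with a normal approximation; because the published figures are rounded, the recovered p values are only approximate, and the function name is hypothetical.

```python
import math


def wald_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover log-HR, standard error, z statistic, and two-sided p value
    from a hazard ratio and its 95% confidence interval (Wald approximation)."""
    beta = math.log(hr)
    # On the log scale the interval is beta +/- z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


# Values reported in the text for PCRT versus PCT.
for label, hr, low, high in [
    ("training set, unadjusted", 0.846, 0.738, 0.970),
    ("multivariate Cox model", 0.862, 0.744, 0.996),
]:
    beta, se, z, p = wald_summary(hr, low, high)
    print(f"{label}: log-HR={beta:.3f}, SE={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Both recovered p values fall below 0.05, in line with the reported P = 0.015 and P = 0.044; small discrepancies are expected from rounding of the hazard ratios and interval limits.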

Directed acyclic graph analysis
Corrective analyses for variables in the multivariate model were selected based on DAGs that showed the possible relationship with advanced GC patients (Fig 3). Based on our univariate and multivariate analyses, the following variables were included in the DAG analysis: sex, age, race, marital status, histologic type, Lauren type, lung metastases, and comprehensive treatment.
Although DAGs presented in previous literature can intuitively reflect the impact of each variable on the survival outcome of advanced GC patients, it is not clear which relationship is sufficiently meaningful, so it is essential to test the DAGs (Table 2). Five variables directly influenced the primary survival outcome: sex (r = -0.0676, HR = 1.032), marital status (r = 0.0268, HR = 1.021), histology type (r = 0.0565, HR = 1.015), lung metastases (r = 0.3200,

Discussion
In this cohort study, the SEER database was used to retrospectively analyze the demographic and clinical characteristics of patients with locally advanced GC to evaluate the effect of PCT and PCRT on the prognosis of such patients. The results indicated that PCRT had significant survival benefits when compared to PCT for patients with advanced GC. In a study using the SEER database, 21,447 patients with stage I-IV GC benefited most from adjuvant RT with CT when compared with surgery or PCT [35]. In the multivariate Cox regression model, male sex, widowed status, signet ring cell carcinoma, and lung metastases were considered to be independent risk factors for a poor prognosis. Sex, race, and Lauren type were associated with the primary survival outcome but were not associated with treatment, the secondary outcome measure, in the DAG analysis; comparison with the Cox regression results indicated that these may be confounding factors. The prognosis of CRT in GC patients has been reported in several previous randomized studies. The INT0116 study suggested that adjuvant CT combined with RT was effective for individuals with specific treatment modalities and disease pathological stages [12]. Findings from our study indicate that PCRT benefits patients who are aged ≤ 65, male, white, or have regional tumors. An analysis of the SEER database from 2005 to 2016 confirms earlier findings that age is a poor prognostic factor and that PCT and age greater than 60 years are related to a worse outcome. Patients who were 60 years or older had a 5-year OS that was nearly 30% lower than those who were younger. In addition, there is an ethnic disparity in PCT use and outcomes among GC patients in the United States. After controlling for patient, disease, and hospital factors, race was independently associated with less PCT use [11]. There was no difference in OS between patients of different ethnicities.
Another study demonstrated that GC patients' comorbidity profiles varied racially and ethnically from those of the matched cancer-free group. In particular, Whites and Blacks had higher rates of comorbid conditions than Asians or Pacific Islanders [36]. The study also discovered that anemia was the most prevalent organ, blood, and metabolic condition among non-White GC patients who were treated with CT within 6 months after diagnosis. The incidence rates were more than twice as high among Chinese, Japanese, and Filipinos as they were among Whites, Blacks, and Hispanics. The optimal therapy for locally advanced GC is perioperative multidisciplinary treatment, including PCT, RT, immunotherapy, and targeted therapy; therefore, comprehensive treatment is paramount to treatment selection [37]. In the West, PCT and PCRT are recommended for patients with resectable disease and regional tumor invasion into the nearby tissues or middle lymph node metastasis, since the SWOG/INT-0116 trial demonstrated that the OS and recurrence-free survival (RFS) of GC patients who had R0 resection followed by PCRT (45 Gy of RT combined with bolus fluorouracil [FU] and leucovorin) were longer than the OS and RFS of those who had surgery alone (5-year OS, 40% vs. 30%, respectively; 5-year RFS, 48% vs. 31%, respectively) [38]. These differences, such as age, race, and the presence of a regional tumor, may have an impact on the curative effect of PCRT.
The improved survival observed with PCRT could be attributed to tumor regression. Multiple trials have shown that PCRT reduces the primary tumor volume and the local tumor recurrence rate and increases the rates of pathological complete response, negative margin resection, and overall survival [8,37]. A study from South Korea showed that progression-free survival (PFS) in patients who received total neoadjuvant therapy (TNT) was significantly improved when compared with that in patients who received neoadjuvant chemoradiotherapy (NCRT), and the OS tended to be longer without any side effects [39]. The Neo-PLANET study analyzed the safety and efficacy of camrelizumab combined with CRT in the neoadjuvant treatment of locally advanced proximal gastric adenocarcinoma. The interim analysis showed a 91.7% R0 resection rate, 12 patients achieved pathological complete response (pCR, 33.3%), and the rate of major pathological response (MPR) was 41.7% [40]. Additionally, there is a potential benefit to treating micrometastatic disease and decreasing tumor cell spread following resection [8]. However, subsets of individuals might not benefit more from CRT owing to poor tolerance and toxicity. A study showed that up to 50% of patients who underwent surgical resection never received CRT, presumably owing to poor tolerance, whereas neoadjuvant therapy may be better tolerated and more reliably applicable [41]. In recent years, technical advances have enabled oncologists to deliver precise doses of RT and minimize exposure to critical organs, resulting in improved therapeutic effects and decreased toxicity.
For reliable causal inferences to be drawn from observational data, confounding needs to be appropriately addressed because it hides the true effect of the exposure. DAG analysis shows that age, race, and Lauren type are integral to the impact of comprehensive treatment, meaning that these factors indirectly affect the prognosis of patients with advanced-stage disease because of the direct impact on the treatment modality, so they could cause confounder bias in regression analyses. GC is considered an age-related disease, with the majority of newly diagnosed patients in the United States being over 75 years old. Elderly patients frequently have restricted inclusion in clinical trials owing to the physiological changes that occur with age, including pharmacodynamic variability, diminished organ function, and impaired functional status, which necessitate individualized treatment approaches. A study reported survival advantages in patients younger than 70 years old who received adjuvant CRT but not in those older than 70 years old [30]. Additionally, a study noted that CRT improved the outcomes of patients with the intestinal type when compared with those with the diffuse type. Although the reason for the difficulties in locoregional control of diffuse GC is yet to be discovered, the benefit of local control may be associated with a better survival rate. These differences call for the exploration of multiple approaches in the potentially curative treatment of advanced GC. Therefore, it is possible to consider that age, race, and Lauren type may be confounding factors affecting prognosis in the advanced stage.
DAGs, a visual representation of causal assumptions, are used by researchers to better examine confounding biases related to causal questions [42,43]. DAGs may be preferable to the conventional definition of confounding, particularly in more complicated situations, as they enable the identification of the presumptive causal mechanism and, consequently, the possibility of collider-stratification bias with certain adjustments, as well as a minimal set of factors to adjust for to remove unwanted confounding [44]. Several tutorials and reviews on the use of DAGs have been published to aid technicians and clinicians [45,46]. Lederer et al. reviewed the general concepts, including confounding and selection bias, for researchers in pulmonary medicine [47]. By doing so, a methodical technique to deliver a summary of the context and the causal research is presented. DAGs clarify the underlying relationships and act as a visual representation of causal assumptions [48]. DAGs can thus be used to clarify confusion and suggest solutions.
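How a DAG points to candidate confounders can be illustrated with a small sketch that searches for common causes of the exposure and the outcome. The graph below is a hypothetical toy example, not the DAG estimated in this study, and the simple rule used here (a variable with a directed path to treatment and a separate directed path to survival) is an assumption made for illustration; it does not replace a full back-door analysis and does not handle colliders.

```python
def descendants(graph, node):
    """All nodes reachable from `node` along directed edges."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for child in graph.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def common_causes(graph, exposure, outcome):
    """Nodes that cause the exposure and also cause the outcome
    through a path that does not pass through the exposure."""
    # Remove arrows leaving the exposure so that paths to the outcome
    # via the exposure itself are not counted.
    pruned = {node: list(children) for node, children in graph.items()}
    pruned[exposure] = []
    found = set()
    for node in graph:
        if node in (exposure, outcome):
            continue
        reach = descendants(pruned, node)
        if exposure in reach and outcome in reach:
            found.add(node)
    return found


# Hypothetical toy DAG (cause -> list of effects), for illustration only.
toy_dag = {
    "age": ["treatment", "survival"],
    "lauren_type": ["treatment", "survival"],
    "sex": ["survival"],
    "treatment": ["survival"],
    "survival": [],
}

print(sorted(common_causes(toy_dag, "treatment", "survival")))
```

In this toy graph, age and Lauren type are flagged because they influence both the choice of treatment and survival, whereas sex affects survival only and therefore is not flagged as a confounder of the treatment effect.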
Although we successfully constructed a DAG to control the bias among patients with GC after surgical resection, our study has several limitations. First, as a retrospective study, the inherent risk of selection bias is inevitable. Second, detailed treatment information is not included in the SEER database, such as the proportion of D2 lymphadenectomy, the extent of the RT field, and CT regimens. Third, although the model still performed well, there was no external validation from other larger institutions. Fourth, the accuracy of the DAGs may be compromised if a causal association between two factors is misrepresented.

Conclusion
In conclusion, this large cohort from the SEER database revealed that PCRT has better survival benefits than PCT for GC patients with advanced-stage disease. PCRT benefits patients who are aged ≤ 65, male, white, or have regional tumors. Hence, PCRT may be feasible for these patients. Moreover, multivariate and DAG analyses showed that sex, marital status, histologic type, lung metastases, and comprehensive treatment were associated with survival in advanced-stage patients. Furthermore, DAGs show that age, race, and Lauren type may be confounding factors that affect prognosis in patients with advanced-stage GC. DAGs, as a useful tool for contending with confounding and selection biases, are integral to the proper implementation of high-quality research.